9 research outputs found

    Finding Subcube Heavy Hitters in Analytics Data Streams

    Data streams typically have items with a large number of dimensions. We study the fundamental heavy-hitters problem in this setting. Formally, the data stream consists of $d$-dimensional items $x_1,\ldots,x_m \in [n]^d$. A $k$-dimensional subcube $T$ is a subset of distinct coordinates $\{T_1,\cdots,T_k\} \subseteq [d]$. A subcube heavy hitter query ${\rm Query}(T,v)$, $v \in [n]^k$, outputs YES if $f_T(v) \geq \gamma$ and NO if $f_T(v) < \gamma/4$, where $f_T(v)$ is the fraction of stream items whose coordinates $T$ have joint values $v$. The all subcube heavy hitters query ${\rm AllQuery}(T)$ outputs all joint values $v$ that return YES to ${\rm Query}(T,v)$. The one-dimensional version of this problem, where $d=1$, was heavily studied in data stream theory, databases, networking and signal processing. The subcube heavy hitters problem is applicable in all these cases. We present a simple reservoir-sampling-based one-pass streaming algorithm to solve the subcube heavy hitters problem in $\tilde{O}(kd/\gamma)$ space. This is optimal up to poly-logarithmic factors given the established lower bound. In the worst case, this is $\Theta(d^2/\gamma)$, which is prohibitive for large $d$, and our goal is to circumvent this quadratic bottleneck. Our main contribution is a model-based approach to the subcube heavy hitters problem. In particular, we assume that the dimensions are related to each other via the Naive Bayes model, with or without a latent dimension. Under this assumption, we present a new two-pass, $\tilde{O}(d/\gamma)$-space algorithm for our problem, and a fast algorithm for answering ${\rm AllQuery}(T)$ in $O(k/\gamma^2)$ time. Our work develops the direction of model-based data stream analysis, with much that remains to be explored. Comment: To appear in WWW 201
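
    The sampling idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm or its analysis: the class name `SubcubeSampler`, the `sample_size` parameter, the 0-indexed coordinates, and the decision threshold of gamma/2 (a midpoint between the YES bound gamma and the NO bound gamma/4) are all assumptions made for illustration.

```python
import random
from collections import Counter


class SubcubeSampler:
    """Hypothetical sketch: uniform reservoir sample of d-dimensional items."""

    def __init__(self, sample_size, seed=None):
        self.sample_size = sample_size  # assumed parameter, not from the paper
        self.sample = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, item):
        """Standard reservoir sampling: each item is kept with equal probability."""
        self.seen += 1
        if len(self.sample) < self.sample_size:
            self.sample.append(tuple(item))
        else:
            j = self.rng.randrange(self.seen)
            if j < self.sample_size:
                self.sample[j] = tuple(item)

    def estimate(self, T, v):
        """Estimate f_T(v): fraction of sampled items whose coordinates T equal v."""
        if not self.sample:
            return 0.0
        v = tuple(v)
        hits = sum(1 for x in self.sample if tuple(x[i] for i in T) == v)
        return hits / len(self.sample)

    def query(self, T, v, gamma):
        """Answer YES (True) when the estimate clears gamma/2 (assumed threshold)."""
        return self.estimate(T, v) >= gamma / 2

    def all_query(self, T, gamma):
        """Return all joint values on T whose estimate clears gamma/2."""
        counts = Counter(tuple(x[i] for i in T) for x in self.sample)
        threshold = gamma / 2 * len(self.sample)
        return sorted(v for v, c in counts.items() if c >= threshold)
```

    A caller would feed each stream item to `add` and then call `query((0, 1), (1, 2), 0.5)` or `all_query((1,), 0.5)`. How large `sample_size` must be for the YES/NO guarantee to hold with high probability is what the paper's space bound addresses; the sketch leaves that choice to the caller.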

    Reinforcement Knowledge Graph Reasoning for Explainable Recommendation

    Recent advances in personalized recommendation have sparked great interest in the exploitation of rich structured information provided by knowledge graphs. Unlike most existing approaches that only focus on leveraging knowledge graphs for more accurate recommendation, we perform explicit reasoning with knowledge for decision making so that the recommendations are generated and supported by an interpretable causal inference procedure. To this end, we propose a method called Policy-Guided Path Reasoning (PGPR), which couples recommendation and interpretability by providing actual paths in a knowledge graph. Our contributions include four aspects. We first highlight the significance of incorporating knowledge graphs into recommendation to formally define and interpret the reasoning process. Second, we propose a reinforcement learning (RL) approach featuring an innovative soft reward strategy, user-conditional action pruning and a multi-hop scoring function. Third, we design a policy-guided graph search algorithm to efficiently and effectively sample reasoning paths for recommendation. Finally, we extensively evaluate our method on several large-scale real-world benchmark datasets, obtaining favorable results compared with state-of-the-art methods. Comment: Accepted in SIGIR 201

    ABSent: Cross-Lingual Sentence Representation Mapping with Bidirectional GANs

    A number of cross-lingual transfer learning approaches based on neural networks have been proposed for the case when large amounts of parallel text are at our disposal. However, in many real-world settings, the size of parallel annotated training data is restricted. Additionally, prior cross-lingual mapping research has mainly focused on the word level. This raises the question of whether such techniques can also be applied to effortlessly obtain cross-lingually aligned sentence representations. To this end, we propose an Adversarial Bi-directional Sentence Embedding Mapping (ABSent) framework, which learns mappings of cross-lingual sentence representations from limited quantities of parallel data.

    CAFE: Coarse-to-Fine Neural Symbolic Reasoning for Explainable Recommendation

    Recent research explores incorporating knowledge graphs (KG) into e-commerce recommender systems, not only to achieve better recommendation performance, but more importantly to generate explanations of why particular decisions are made. This can be achieved by explicit KG reasoning, where a model starts from a user node, sequentially determines the next step, and walks towards an item node of potential interest to the user. However, this is challenging due to the huge search space, unknown destination, and sparse signals over the KG, so informative and effective guidance is needed to achieve a satisfactory recommendation quality. To this end, we propose a CoArse-to-FinE neural symbolic reasoning approach (CAFE). It first generates user profiles as coarse sketches of user behaviors, which subsequently guide a path-finding process to derive reasoning paths for recommendations as fine-grained predictions. User profiles can capture prominent user behaviors from the history, and provide valuable signals about which kinds of path patterns are more likely to lead to potential items of interest for the user. To better exploit the user profiles, an improved path-finding algorithm called Profile-guided Path Reasoning (PPR) is also developed, which leverages an inventory of neural symbolic reasoning modules to effectively and efficiently find a batch of paths over a large-scale KG. We extensively experiment on four real-world benchmarks and observe substantial gains in the recommendation performance compared with state-of-the-art methods. Comment: Accepted in CIKM 202

    Semantic Search with Information Integration

    Since the first search engine was released in 1993, development has never slowed, and various search engines have emerged to vie for popularity. However, current traditional search engines such as Google and Yahoo! are keyword-based, which leads to imprecise results and redundant information. A search engine with semantic analysis could be an alternative in the future: it is more intelligent and informative, and provides better interaction with users. This thesis discusses semantic search in detail, explains its advantages over keyword-based search, and introduces how to integrate semantic analysis with common search engines. The thesis closes with an example implementation of a simple semantic search engine.